Flexible Analog Search with Kernel PCA Embedded Molecule Vectors
Authors
Abstract
Studying analog series to find structural transformations that enhance the activity and ADME properties of lead compounds is an important part of drug development. Matched molecular pair (MMP) search is a powerful tool for analog analysis that imitates researchers' ability to select pairs of compounds that differ only by small, well-defined transformations. Abstraction is a challenge for existing MMP search algorithms: they can omit relevant but inexact MMPs and include irrelevant, contextually dissimilar MMPs. In this work, we present a new method for MMP search that returns approximate results and enables flexible control over the abstraction of contextual information. We illustrate the concepts and mechanics of our method with a series of exemplar MMP queries, and then benchmark search accuracy using MMPs found by fragment indexing. We show that we can search for MMPs in a context-dependent manner and accurately approximate context-independent, fragment-index-based MMP search over a range of fingerprint and dataset conditions. Our method can be used to search for pairwise correspondences among analog sets and to bolster MMP datasets where data is missing or incomplete.
Similar resources
Recognizing Faces using Kernel Eigenfaces and Support Vector Machines
In face recognition, Principal Component Analysis (PCA) is often used to extract a low-dimensional face representation based on the eigenvectors of the face image autocorrelation matrix. Kernel Principal Component Analysis (Kernel PCA) has recently been proposed as a non-linear extension of PCA. While PCA is able to discover and represent linearly embedded manifolds, Kernel PCA can extract low d...
Support Vector Machine Approximation using Kernel PCA
The Support Vector Machine (SVM) is an important technique used for classification and regression. Although SVMs are very accurate, classification slows down as the number of support vectors increases. This paper describes one method of reducing the number of support vectors through the application of Kernel PCA. This method is different from other proposed methods as we show that the exac...
Massively Parallel Mixed-Signal VLSI Kernel Machines
Recently it has been shown that a simple learning paradigm, the support vector machine (SVM), outperforms some of the most elaborately tuned expert systems and neural networks in object recognition tasks. At run time, the SVM operates by computing a kernel-based distance between the object’s vector at the input and a set of support vectors selected from the training set, and weighting the result...
Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA
Recently, we proposed an improvement to the eigenvoice (EV) speaker adaptation called kernel eigenvoice (KEV) speaker adaptation. In KEV adaptation, eigenvoices are computed using kernel PCA, and a new speaker’s adapted model is implicitly computed in the kernel-induced feature space. Due to many online kernel evaluations, both adaptation and subsequent recognition of KEV adaptation are slower ...
A modified concept of PCA to reduce the classification error using kernel SVM classifier
This paper focuses on the mathematical technique PCA and its drawback of mixing data pixels. We extract the principal directions of the covariance ellipse as done in PCA, but we do not blindly take the eigenvectors corresponding to the k largest eigenvalues. Instead, we transform the data vectors into the new n-dimensional (n is the dimension of the old input space) vector space spanned by the Eig...
Journal title:
Volume 15, Issue
Pages -
Publication date 2017